AITopics | modeling tabular data

modeling tabular data

Modeling Tabular data using Conditional GAN

Neural Information Processing Systems | Dec-25-2025, 03:50:39 GMT

Modeling the probability distribution of rows in tabular data and generating realistic synthetic data is a non-trivial task. Tabular data usually contains a mix of discrete and continuous columns. Continuous columns may have multiple modes, whereas discrete columns are sometimes imbalanced, which makes modeling difficult. Existing statistical and deep neural network models fail to properly model this type of data. We design CTGAN, which uses a conditional generative adversarial network to address these challenges. To aid in a fair and thorough comparison, we design a benchmark with 7 simulated and 8 real datasets and several Bayesian network baselines. CTGAN outperforms Bayesian methods on most of the real datasets, whereas other deep learning methods could not.
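To make the imbalance problem concrete, here is a minimal sketch of how a conditional generator might be fed conditions so that rare categories are not starved. The abstract does not spell out the sampling scheme, so the log-frequency weighting below is an assumption for illustration, and the helper names `log_frequency_weights` and `sample_condition` as well as the `fraud` column are hypothetical.

```python
import math
import random
from collections import Counter


def log_frequency_weights(values):
    """Return {category: probability} with mass proportional to log(1 + count).

    Assumption: log scaling is used only to illustrate how a rare category
    can receive more conditioning mass than its raw frequency would give it.
    """
    counts = Counter(values)
    raw = {cat: math.log(1 + n) for cat, n in counts.items()}
    total = sum(raw.values())
    return {cat: w / total for cat, w in raw.items()}


def sample_condition(columns, rng):
    """Pick a discrete column uniformly, then a category by log-frequency.

    `columns` maps a column name to the list of its observed values.
    Returns a (column_name, category) pair to condition the generator on.
    """
    name = rng.choice(sorted(columns))
    weights = log_frequency_weights(columns[name])
    cats = sorted(weights)
    category = rng.choices(cats, weights=[weights[c] for c in cats], k=1)[0]
    return name, category


# Hypothetical imbalanced column: 990 "no" rows and 10 "yes" rows.
columns = {"fraud": ["no"] * 990 + ["yes"] * 10}
weights = log_frequency_weights(columns["fraud"])
condition = sample_condition(columns, random.Random(0))
```

With raw frequencies the minority value would be drawn as a condition about 1% of the time; under the assumed log weighting it is drawn far more often, which is the kind of rebalancing a conditional setup makes possible.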

conditional gan, modeling tabular data, name change, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.62)

Reviews: Modeling Tabular data using Conditional GAN

Neural Information Processing Systems | Jan-22-2025, 09:42:00 GMT

Originality: The main originality of the paper is a data transformation process applied to tabular data so that a GAN can learn from it. This is definitely highly novel and can potentially be useful in similar situations involving such distributions. Apart from this, however, I feel that the authors are overclaiming a bit regarding several challenges/contributions:
- C2 (L86): The choice of activation function certainly depends on the data format; listing that as a "challenge" seems a bit too much to me, unless the authors can point out non-trivial adaptations they made to address the problem (and I apologize if I missed that...).
- C4 (L98): Again, hardly something new.
- C5 (L105): Mode collapse is certainly well studied in the literature (speaking of which, the authors should add references on newer approaches such as BourGAN); using an off-the-shelf solution (PacGAN), again, does not seem to me to be an important contribution.
Rephrasing the section to focus on the important contributions (C3, and perhaps C1) will make the contributions of the paper clearer, in my opinion.

Quality: The paper is of high quality and the description of techniques is sound.

architecture, contribution, justification, (13 more...)

Neural Information Processing Systems

Genre: Personal (0.57)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)

Modeling Tabular data using Conditional GAN

Neural Information Processing Systems | Oct-9-2024, 16:56:49 GMT

conditional gan, modeling tabular data

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Comparative Analysis of Transformers for Modeling Tabular Data: A Casestudy using Industry Scale Dataset

Singh, Usneek, Arora, Piyush, Ganesan, Shamika, Kumar, Mohit, Kulkarni, Siddhant, Joshi, Salil R.

arXiv.org Artificial Intelligence | Nov-24-2023

We perform a comparative analysis of transformer-based models designed for modeling tabular data, specifically on an industry-scale dataset. While earlier studies demonstrated promising outcomes on smaller public or synthetic datasets, the effectiveness did not extend to larger industry-scale datasets. The challenges identified include handling high-dimensional data, the necessity for efficient pre-processing of categorical and numerical features, and addressing substantial computational requirements. To overcome the identified challenges, the study conducts an extensive examination of various transformer-based models using both synthetic datasets and the default prediction Kaggle dataset (2022) from American Express. The paper presents crucial insights into optimal data pre-processing, compares pre-training and direct supervised learning methods, discusses strategies for managing categorical and numerical features, and highlights trade-offs between computational resources and performance. Focusing on temporal financial data modeling, the research aims to facilitate the systematic development and deployment of transformer-based models in real-world scenarios, emphasizing scalability.

architecture, dataset, transformer, (15 more...)

arXiv.org Artificial Intelligence

2311.14335

Country:

Asia > India > Karnataka > Bengaluru (0.05)
Asia > China > Beijing > Beijing (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

PTab: Using the Pre-trained Language Model for Modeling Tabular Data

Liu, Guang, Yang, Jie, Wu, Ledell

arXiv.org Artificial Intelligence | Sep-15-2022

Tabular data is the foundation of the information age and has been extensively studied. Recent studies show that neural-based models are effective in learning contextual representations for tabular data. Learning an effective contextual representation requires meaningful features and a large amount of data. However, current methods often fail to properly learn a contextual representation from features that lack semantic information. In addition, it is intractable to enlarge the training set by mixing tabular datasets because of the differences between datasets. To address these problems, we propose a novel framework, PTab, which uses a pre-trained language model to model tabular data. PTab learns a contextual representation of tabular data through a three-stage process: Modality Transformation (MT), Masked-Language Fine-tuning (MF), and Classification Fine-tuning (CF). We initialize our model with a pre-trained model (PTM) that contains semantic information learned from large-scale language data. Consequently, contextual representations can be learned effectively during the fine-tuning stages. In addition, we can naturally mix the textualized tabular data to enlarge the training set and further improve representation learning. We evaluate PTab on eight popular tabular classification datasets. Experimental results show that our method achieves a better average AUC score in supervised settings than state-of-the-art baselines (e.g. XGBoost), and outperforms counterpart methods under semi-supervised settings. We present visualization results showing that PTab has good instance-based interpretability.
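The Modality Transformation stage can be pictured with a short sketch that turns each row into text, so that rows from tables with different schemas can share one training corpus. The abstract does not give the exact template, so the `column is value` phrasing, the function name `textualize_row`, and the two toy tables below are assumptions for illustration only.

```python
def textualize_row(row):
    """Turn one table row (a dict of column -> value) into a sentence-like string.

    Assumption: a "column is value" template joined by commas; the template
    actually used by PTab may differ.
    """
    parts = [f"{column} is {value}" for column, value in row.items()]
    return ", ".join(parts) + "."


# Two hypothetical tables with unrelated schemas.
census_rows = [{"age": 39, "education": "Bachelors"}]
bank_rows = [{"job": "admin", "balance": 2143, "housing": "yes"}]

# After textualization both tables are plain strings, so they can be mixed
# into a single corpus for masked-language fine-tuning.
corpus = [textualize_row(r) for r in census_rows + bank_rows]
```

Because column names carry their meaning into the text, a pre-trained language model can reuse what it already knows about words such as "age" or "balance", which is the semantic information the abstract refers to.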

artificial intelligence, machine learning, pre-trained language model, (2 more...)

arXiv.org Artificial Intelligence

2209.0806

Genre: Research Report > New Finding (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Modeling Tabular data using Conditional GAN

Xu, Lei, Skoularidou, Maria, Cuesta-Infante, Alfredo, Veeramachaneni, Kalyan

Neural Information Processing Systems | Mar-18-2020, 23:32:09 GMT

conditional gan, modeling tabular data

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
